Scalable Adaptation of State Complexity for Nonparametric Hidden Markov Models

Authors

  • Michael C. Hughes
  • William T. Stephenson
  • Erik B. Sudderth
Abstract

Bayesian nonparametric hidden Markov models are typically learned via fixed truncations of the infinite state space or local Monte Carlo proposals that make small changes to the state space. We develop an inference algorithm for the sticky hierarchical Dirichlet process hidden Markov model that scales to big datasets by processing a few sequences at a time yet allows rapid adaptation of the state space cardinality. Unlike previous point-estimate methods, our novel variational bound penalizes redundant or irrelevant states and thus enables optimization of the state space. Our birth proposals use observed data statistics to create useful new states that escape local optima. Merge and delete proposals remove ineffective states to yield simpler models with more affordable future computations. Experiments on speaker diarization, motion capture, and epigenetic chromatin datasets discover models that are more compact, more interpretable, and better aligned to ground truth segmentations than competitors. We have released an open-source Python implementation which can parallelize local inference steps across sequences.

Similar resources

Material: Scalable Adaptation of State Complexity for Nonparametric Hidden Markov Models. Paper published at NIPS 2015.

A Experiment Details: A.1 Toy Data; A.2 Speaker Diarization; A.3 Motion capture dataset; A.4 Chromatin epigenomic dataset ...

Bayesian time series models and scalable inference

With large and growing datasets and complex models, there is an increasing need for scalable Bayesian inference. We describe two lines of work to address this need. In the first part, we develop new algorithms for inference in hierarchical Bayesian time series models based on the hidden Markov model (HMM), hidden semi-Markov model (HSMM), and their Bayesian nonparametric extensions. The HMM is ...

Small-Variance Asymptotics for Hidden Markov Models

Small-variance asymptotics provide an emerging technique for obtaining scalable combinatorial algorithms from rich probabilistic models. We present a small-variance asymptotic analysis of the Hidden Markov Model and its infinite-state Bayesian nonparametric extension. Starting with the standard HMM, we first derive a “hard” inference algorithm analogous to k-means that arises when particular var...

Consistency of Bayesian nonparametric Hidden Markov Models

We are interested in Bayesian nonparametric Hidden Markov Models. More precisely, we are going to prove the consistency of these models under appropriate conditions on the prior distribution and when the number of states of the Markov Chain is finite and known. Our approach is based on exponential forgetting and usual Bayesian consistency techniques.

Minimax Adaptive Estimation of Nonparametric Hidden Markov Models

We consider stationary hidden Markov models with finite state space and nonparametric modeling of the emission distributions. It has remained unknown until very recently that such models are identifiable. In this paper, we propose a new penalized least-squares estimator for the emission distributions which is statistically optimal and practically tractable. We prove a non-asymptotic oracle ineq...

Publication date: 2015